
13 research outputs found

Forward Stochastic Reachability Analysis for Uncontrolled Linear Systems using Fourier Transforms

Author: Bracewell Ron
Dorato Peter
Kvasnica Michal
Lasota Andrzej
Manganini Giorgio
Stein Elias M
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 13/02/2017

We propose a scalable method for forward stochastic reachability analysis for uncontrolled linear systems with affine disturbance. Our method uses Fourier transforms to efficiently compute the forward stochastic reach probability measure (density) and the forward stochastic reach set. This method is applicable to systems with bounded or unbounded disturbance sets. We also examine the convexity properties of the forward stochastic reach set and its probability density. Motivated by the problem of a robot attempting to capture a stochastically moving, non-adversarial target, we demonstrate our method on two simple examples. Where traditional approaches provide approximations, our method provides exact analytical expressions for the densities and probability of capture. Comment: V3: HSCC 2017 (camera-ready copy), DOI updated, minor changes | V2: Review comments included | V1: 10 pages, 12 figures

arXiv.org e-Print Archive

Crossref
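
As a rough illustration of the Fourier-transform idea in the reachability abstract above, the sketch below treats only a scalar system x_{k+1} = a x_k + w_k with a known initial state and independent, identically distributed disturbances. The state at step k is a weighted sum of independent disturbances, so its characteristic function is a product of scaled disturbance characteristic functions, and the density follows by numerical Fourier inversion. The function name `reach_density`, the Gaussian disturbance, and the grid parameters `t_max` and `n_t` are assumptions made for this example; this is not the paper's implementation.

```python
import numpy as np

def reach_density(a, x0, k, cf_w, xs, t_max=40.0, n_t=4001):
    """Density of x_k for x_{j+1} = a*x_j + w_j, with x_0 known.

    cf_w is the characteristic function of one disturbance, E[exp(i*t*w)].
    All names and defaults here are illustrative assumptions.
    """
    t = np.linspace(-t_max, t_max, n_t)
    # Deterministic part a^k * x0 contributes a pure phase factor.
    phi = np.exp(1j * t * (a ** k) * x0)
    # x_k also contains sum_j a^j * w_(k-1-j); independence turns the
    # sum into a product of scaled characteristic functions.
    for j in range(k):
        phi = phi * cf_w((a ** j) * t)
    # Inverse Fourier transform p(x) = (1/2pi) * integral phi(t) e^{-itx} dt,
    # approximated by a Riemann sum on the frequency grid.
    dt = t[1] - t[0]
    integrand = phi[None, :] * np.exp(-1j * np.outer(xs, t))
    return np.real(integrand.sum(axis=1)) * dt / (2.0 * np.pi)

# Usage with an assumed Gaussian disturbance of standard deviation 0.5.
sigma = 0.5
gaussian_cf = lambda t: np.exp(-0.5 * (sigma * t) ** 2)
grid = np.linspace(-2.0, 3.0, 11)
density = reach_density(0.9, 1.0, 5, gaussian_cf, grid)
```

For a Gaussian disturbance the result can be checked against the closed-form normal density, which is why that case is used here; a bounded disturbance would only require swapping in a different `cf_w`.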

Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning

Author: Alberto Maria Metelli
Angelo Damiani
Giorgio Manganini
Marcello Restelli
Publication venue: PMLR
Publication date: 01/01/2022

We propose a novel formulation for the Inverse Reinforcement Learning (IRL) problem, which jointly accounts for the compatibility with the expert behavior of the identified reward and its effectiveness for the subsequent forward learning phase. Albeit quite natural, especially when the final goal is apprenticeship learning (learning policies from an expert), this aspect has been completely overlooked by IRL approaches so far. We propose a new model-free IRL method that autonomously finds a trade-off between the error induced on the learned policy when potentially choosing a sub-optimal reward, and the estimation error caused by using finite samples in the forward learning phase, which can be controlled by also explicitly optimizing the discount factor of the related learning problem. The approach is based on a min-max formulation for the robust selection of the reward parameters and the discount factor so that the distance between the expert’s policy and the learned policy is minimized in the successive forward learning task when a finite and possibly small number of samples is available. Unlike the majority of other IRL techniques, our approach does not involve any planning or forward Reinforcement Learning problems to be solved. After presenting the formulation, we provide a numerical scheme for the optimization, and we show its effectiveness on an illustrative numerical case.

Archivio istituzionale della ricerca - Politecnico di Milano

Following Newton direction in Policy Gradient with parameter exploration

Author: Bascetta Luca
Manganini Giorgio
Pirotta Matteo
Restelli Marcello
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

This paper investigates the use of second-order methods to solve Markov Decision Processes (MDPs). Despite the popularity of second-order methods in the optimization literature, so far little attention has been paid to the extension of such techniques to sequential decision problems. Here we provide a model-free Reinforcement Learning method that estimates the Newton direction by sampling directly in the parameter space. In order to compute the Newton direction we provide the formulation of the Hessian of the expected return, a technique for variance reduction in the sample-based estimation, and a finite sample analysis in the case of a Normal distribution. Besides discussing the theoretical properties, we empirically evaluate the method on an instructional linear-quadratic regulator and on a complex dynamical quadrotor system.

Archivio istituzionale della ricerca - Politecnico di Milano
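
The idea of estimating a Newton direction by sampling in parameter space, as described in the abstract above, can be sketched for a one-dimensional case. The example assumes policy parameters drawn from a Gaussian with mean `mu` and fixed standard deviation `sigma`, and uses the standard likelihood-ratio identities for the gradient and Hessian of the expected return with respect to `mu`. The function name `newton_step`, the toy return function, and the sample-mean baseline (a simple stand-in for the paper's variance-reduction technique) are all assumptions made for illustration.

```python
import numpy as np

def newton_step(mu, sigma, returns_fn, n_samples=200_000, seed=0):
    """One sample-based Newton update of the mean of a Gaussian
    parameter distribution (scalar case, fixed sigma).

    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(mu, sigma, size=n_samples)  # sampled parameters
    ret = returns_fn(theta)                        # return of each sample
    adv = ret - ret.mean()                         # baseline-subtracted return
    score = (theta - mu) / sigma ** 2              # d/dmu of log N(theta; mu, sigma^2)
    # Likelihood-ratio estimates of the first and second derivative of
    # the expected return J(mu) = E[returns_fn(theta)].
    grad = np.mean(adv * score)
    hess = np.mean(adv * (score ** 2 - 1.0 / sigma ** 2))
    # Newton update; it moves toward a maximum when hess is negative.
    return mu - grad / hess, grad, hess

# Toy return with its maximum at theta = 1 (purely hypothetical).
toy_return = lambda theta: -(theta - 1.0) ** 2
new_mu, grad, hess = newton_step(mu=0.0, sigma=0.5, returns_fn=toy_return)
```

For this quadratic toy return the exact gradient and Hessian at `mu = 0` are 2 and -2, so a single Newton step should land close to the maximizer at 1, up to sampling noise.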

A classification-based approach to the optimal control of affine switched systems

Author: Manganini Giorgio
Piroddi Luigi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

This paper deals with the optimal control of discrete-time switched systems, characterized by a finite set of operating modes, each one associated with given affine dynamics. The objective is the design of the switching law so as to minimize an infinite-horizon expected cost that penalizes frequent switchings. The optimal switching law is computed off-line, which allows an efficient online operation of the control via a state feedback policy. The latter associates a mode to each state and, as such, can be viewed as a classifier. In order to train such a classifier-type controller, one first needs to generate a set of training data in the form of optimal state-mode pairs. In the considered setting, this involves solving a Mixed Integer Quadratic Programming (MIQP) problem for each pair. A key feature of the proposed approach is the use of a classification method that provides guarantees on the generalization properties of the classifier. The approach is tested on a multi-room heating control problem.

Archivio istituzionale della ricerca - Politecnico di Milano

Policy Search for the Optimal Control of Markov Decision Processes: A Novel Particle-Based Iterative Scheme

Author: Manganini Giorgio
Piroddi Luigi
Pirotta Matteo
Prandini Maria
Restelli Marcello
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

Archivio istituzionale della ricerca - Politecnico di Milano

A majority voting classifier with probabilistic guarantees

Author: Falsone Alessandro
Manganini Giorgio
Prandini Maria
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

This paper deals with supervised learning for classification. A new general purpose classifier is proposed that builds upon the Guaranteed Error Machine (GEM). Standard GEM can be tuned to guarantee a desired (small) misclassification probability and this is achieved by letting the classifier return an unknown label. In the proposed classifier, the size of the unknown classification region is reduced by introducing a majority voting mechanism over multiple GEMs. At the same time, the possibility of tuning the misclassification probability is retained. The effectiveness of the proposed majority voting classifier is shown on both synthetic and real benchmark datasets, and the results are compared with other well-established classification algorithms.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

A data-based approach to power capacity optimization

Author: Falsone Alessandro
Jan Siroky
Manganini Giorgio
Prandini Maria
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017

Archivio istituzionale della ricerca - Politecnico di Milano

Optimal control to reduce emissions in gasoline engines: An iterative learning control approach for ECU calibration maps improvement

Author: Caporale Danilo
Deori Luca
Falsone Alessandro
Giulioni Luca
Manganini Giorgio
Mura Roberto
Pirotta Matteo
Vignali Riccardo Maria
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015

Control of emissions in gasoline engines has become more stringent in the last decades, especially in Europe, posing new and important problems in the control of complex nonlinear systems. In this work a preliminary investigation is conducted on the idea of exploiting Iterative Learning Control to optimize calibration maps that are commonly used in the Engine Control Unit of gasoline engines. In this spirit, starting from existing maps, we show how to refine them using a gradient-descent iterative learning control algorithm, considering additional constraints in the optimization problem. The outcome of this procedure is a control signal which can be integrated in a modified map. The performance of the proposed technique is validated on the provided training signal and cross-validated on different reference signals. Simulation results show the effectiveness of the approach.

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref
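
A gradient-descent iterative learning control update with constraints, of the kind mentioned in the abstract above, might look like the following sketch. It assumes a linear "lifted" plant model y = G u over one trial, a quadratic tracking cost, and simple box bounds on the input enforced by clipping (projected gradient descent). The engine model, the actual constraints, and the map integration step from the paper are not reproduced; `gradient_ilc`, the toy impulse response, and the bounds are hypothetical.

```python
import numpy as np

def gradient_ilc(G, r, u0, lower, upper, n_iter=200):
    """Projected gradient-descent ILC for an assumed lifted model y = G @ u.

    Each trial updates the input with the gradient of 0.5*||r - G u||^2
    and clips it to [lower, upper]. Illustrative sketch only.
    """
    # Step size 1/L, where L = ||G||_2^2 is the Lipschitz constant of
    # the cost gradient; this keeps the tracking error non-increasing.
    alpha = 1.0 / np.linalg.norm(G, 2) ** 2
    u = np.clip(np.asarray(u0, dtype=float), lower, upper)
    errors = []
    for _ in range(n_iter):
        e = r - G @ u                       # tracking error of this trial
        errors.append(np.linalg.norm(e))
        u = np.clip(u + alpha * G.T @ e, lower, upper)
    return u, errors

# Toy plant: lower-triangular Toeplitz matrix built from a decaying
# impulse response (a hypothetical stand-in for a real system model).
N = 20
h = 0.5 ** np.arange(N)
G = np.array([[h[i - j] if i >= j else 0.0 for j in range(N)]
              for i in range(N)])
r = np.sin(np.linspace(0.0, np.pi, N))      # reference trajectory
u, errors = gradient_ilc(G, r, np.zeros(N), lower=-2.0, upper=2.0)
```

Clipping is the simplest way to respect input bounds here; richer constraints would call for a proper constrained solver at each trial.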

Analysis of Different Strategies for Lowering the Operation Temperature in Existing District Heating Networks

Author: Francesco Neirotti
Giorgio Manganini
Michel Noussan
Stefano Riverso
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

District heating systems have an important role in increasing the efficiency of the heating and cooling sector, especially when coupled to combined heat and power plants. However, in the transition towards decarbonization, current systems show some challenges for the integration of Renewable Energy Sources and Waste Heat. In particular, a crucial aspect is represented by the operating temperatures of the network. This paper analyzes two different approaches for decreasing the operating temperatures of existing networks, which often supply old buildings with a low degree of insulation. A simulation model was applied to some case studies to evaluate how a low-temperature operation of an existing district heating system performs compared to the standard operation, by considering two different approaches: (1) a different control strategy involving nighttime operation to avoid the morning peak demand; and (2) the partial insulation of the buildings to decrease operating temperatures without needing to modify the heating system of the users. Different temperatures were considered to evaluate a threshold based on the characteristics of the buildings supplied by the network. The results highlight an interesting potential for optimization of existing systems by tuning the control strategies and performing some energy efficiency interventions. The network temperature can be decreased with a continuous operation of the system, or with energy efficiency interventions in buildings, and distributed heat pumps used as integration could provide significant advantages. Each solution has its own limitations and critical parameters, which are discussed in detail.

Directory of Open Access Journals

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)